Goals and Inspiration

A Reddit post in April 2020 asked a question that strikes at the tricky issue of justifying a serious critique of the 45th President of the United States.

Is Trump really as bad as Left wing news sources make him out to be?

For many, the answer seems to burst forth from a very emotional or single-issue platform. I found that I couldn’t answer the question in a way that I would respect if I heard it in opposition.

Luckily, there are many journalists who have devoted untold amounts of time to creating the kind of factual backdrop that would allow one to refresh themselves on how their emotional positions came to be. For this project, to begin with, I pulled exclusively from the amazing work of Ben Parker, Stephanie Steinbrecher and Kelsey Ronan at McSweeney’s. In preperation for the 2018 Midterms, they compiled a list of every factual event that they felt constituted a breach of the acceptable bounds of Presidential behaviour. While the piece is partisan, I was impressed with the dry cut presentations of these trangressions - it did not feel liberal to point out that in 2011, Donald Trump told Bill O’Reilly that President Obama did not have a birth certificate. And so in great categorical detail, these journalists listed 546 similar points of fact categorized by general domains.

I was incredibly impressed with the impact of this list, but felt it could be added by a graphical display. This page is an attempt to parse the list and repackage it into said display.

Web Scraping

To accomplish the task in R, the following packages were required.

library(rvest)
library(dplyr)
library(tidyr)
library(purrr)
library(ggplot2)
library(ggthemes)
library(plotly)

The following code pulls the key factors from the article: * The description of the image of the colored dot representing the category of the list item * The text of the list item + The date of the list item, which is pulled out from the text Due to inconsistencies in the formatting of the web page, there are some list items that are not formated with the identical hyphens that are required by my parsing method. As a result, these items must be dropped, which is accomplished by retaining only items that returned a proper Date value. As a result, we sadly lose 40 list items. 506 remain.

HTMLRead <- read_html("https://www.mcsweeneys.net/articles/the-complete-listing-so-far-atrocities-1-546")

key1 <- html_nodes(HTMLRead, "ol li")
key2 <- list(NULL)
for(i in 1:length(key1)){
  key2[[i]] <- paste(xml_attrs(xml_child(key1[[i]]))[1])
}
key3 <- as.data.frame(cbind(key2))

           
text1 <- HTMLRead %>% 
  html_nodes("ol li") %>%
  html_text() %>%
  as.data.frame()

text2 <- separate(text1, col = .,sep=" – ", into=c("Delete1", "Date","Text"),extra = "merge")
text2$Date <- as.Date(text2$Date, format = "%B %d, %Y")

data1 <- cbind(key3, text2)

data2 <- subset(data1,!is.na(Date))

We now conditionally transform the image files representing the category indicators into factors that describe each list item, and then retain list items where the process was succesful, bringing the total count down to 503.

for (i in 1:nrow(data2)){
data2$Category[i] <- if (grepl("red",data2$key2[i])){"Sexual Misconduct"} else
                  if (grepl("lightblue",data2$key2[i])){"Public Statements"} else
                  if (grepl("yellow",data2$key2[i])){"Collusion or Obstruction of Justice"} else
                  if (grepl("darkpurple",data2$key2[i])){"Staff or Administration"} else
                  if (grepl("pink",data2$key2[i])){"Family Business"} else
                  if (grepl("orange",data2$key2[i])){"Policy"} else
                  if (grepl("green",data2$key2[i])){"Environment"} else{"White Supremacy"}
}

data3 <- data2[,-(1:2)]
data3 <- data3[complete.cases(data3),]

Preparing Data for Graphing

There are 2820 days between the first and last list item. To summarize for easier presentation, I want to reduce the number of points by reducing all dates to be simply the first day of the month-year they are in. Additionally, I would like to add some line breaks to the text, so that they will be readable on a single chart.

monthStart <- function(x) {
  x <- as.POSIXlt(x)
  x$mday <- 1
  as.Date(x)
}

data3$Month <- monthStart(data3$Date)
data3 <- data3[!duplicated(data3[,c("Text", "Month")]),]
data3$Event <- gsub('(.{1,90})(\\s|$)', '\\1\n',data3$Text)

And… Voila! Making use of Plotly’s amazing graphics features, the following plot is highly interactive. By hovering over the plot, details about the event are highlighted. Addtionally, double clicking on a category isolates that category within the timeline.

I aim to include data from additional sources soon. The list grows.